Smart Paper Notes

Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5)

Shijie Geng , Shuchang Liu , Zuohui Fu , Yingqiang Ge , Yongfeng Zhang

Categories: Artificial Intelligence | Natural Language Processing | Machine Learning

2022-03-24

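For a long time, different recommendation tasks have typically required task-specific architectures and training objectives. As a result, it is hard to transfer learned knowledge and representations from one task to another, which limits the generalization ability of existing recommendation approaches; for example, a sequential recommendation model can hardly be applied to review generation. To address such issues, and considering that language can describe almost anything and that language grounding is a powerful medium for representing various problems or tasks, we present a flexible and unified text-to-text paradigm for recommendation called the "Pretrain, Personalized Prompt, and Predict Paradigm" (P5), which unifies various recommendation tasks in a shared framework. In P5, all data, such as user-item interactions, user descriptions, item metadata, and user reviews, are converted to a common format: natural language sequences. The rich information in natural language helps P5 capture deeper semantics for personalization and recommendation. Specifically, P5 learns different tasks with the same language modeling objective during pretraining. It thus serves as a foundation model for various downstream recommendation tasks, allows easy integration with other modalities, and enables instruction-based recommendation through prompts. P5 advances recommender systems from shallow models to deep models to big models, and will revolutionize the technical form of recommender systems towards a universal recommendation engine. With adaptive personalized prompts for different users, P5 can make predictions in a zero-shot or few-shot manner and largely reduces the need for extensive fine-tuning. We conduct experiments on several recommendation benchmarks to show the effectiveness of P5. We release the source code at https://github.com/jeykigung/p5.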
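
The conversion of interaction data into natural language sequences can be illustrated with a minimal sketch. It assumes a rating-prediction task and a made-up prompt wording; the names `Interaction`, `RATING_TEMPLATE`, and `to_text_pair` are hypothetical and are not taken from the P5 release.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    user_id: str
    item_id: str
    rating: int


# Hypothetical prompt wording, chosen for illustration only.
RATING_TEMPLATE = "What star rating do you think user_{user} will give item_{item}?"


def to_text_pair(x: Interaction) -> tuple[str, str]:
    """Turn one user-item interaction into an (input text, target text) pair."""
    source = RATING_TEMPLATE.format(user=x.user_id, item=x.item_id)
    target = str(x.rating)
    return source, target


# Every task becomes a text-to-text example that a single
# sequence-to-sequence language model could be trained on.
pairs = [to_text_pair(Interaction("23", "7391", 5))]
```

Under this assumption, other tasks (sequential recommendation, review generation, and so on) would only need their own templates, while the training objective stays the same.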

Translated by Google Translate

Personalized Counterfactual Fairness in Recommendation

Yunqi Li , Hanxiong Chen , Shuyuan Xu , Yingqiang Ge , Yongfeng Zhang

Categories: Artificial Intelligence | Machine Learning

2021-05-20

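Recommender systems have a growing and critical impact on people and society, since an increasing number of users rely on them for information seeking and decision making. It is therefore crucial to address potential unfairness problems in recommendation. Just as users have personalized preferences over items, users' demands for fairness are also personalized in many scenarios. It is therefore important to provide personalized fair recommendations that satisfy users' personalized fairness demands. In addition, previous work on fair recommendation mainly focuses on association-based fairness. However, it is important to advance from associative fairness notions to causal fairness notions in order to assess fairness more properly in recommender systems. Based on these considerations, this paper focuses on achieving personalized counterfactual fairness for users in recommender systems. To this end, we introduce a framework that achieves counterfactually fair recommendations through adversarial learning, by generating feature-independent user embeddings for recommendation. The framework allows recommender systems to achieve personalized fairness for users while also covering non-personalized situations. Experiments on two real-world datasets with shallow and deep recommendation algorithms show that our method can generate fairer recommendations for users with desirable recommendation performance.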

CORGI-PM: A Chinese Corpus For Gender Bias Probing and Mitigation

Ge Zhang , Yizhi Li , Yaoyao Wu , Linyuan Zhang , Chenghua Lin , Jiayi Geng , Shi Wang , Jie Fu

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2023-01-01

As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, prevalent data-driven techniques such as large-scale language models suffer from inadequate data and biased corpora, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which require models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To the best of our knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.

Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits

Ruibo Liu , Chenyan Jia , Ge Zhang , Ziyu Zhuang , Tony X Liu , Soroush Vosoughi

Categories: Natural Language Processing | Artificial Intelligence

2023-01-01

We present Second Thought, a new learning paradigm that enables language models (LMs) to re-align with human values. By modeling the chain-of-edits between value-unaligned and value-aligned text, with LM fine-tuning and additional refinement through reinforcement learning, Second Thought not only achieves superior performance in three value alignment benchmark datasets but also shows strong human-value transfer learning ability in few-shot scenarios. The generated editing steps also offer better interpretability and ease for interactive error correction. Extensive human evaluations further confirm its effectiveness.

VertMatch: A Semi-supervised Framework for Vertebral Structure Detection in 3D Ultrasound Volume

Hongye Zeng , Kang Zhou , Songhan Ge , Yuchong Gao , Jianhao Zhao , Shenghua Gao , Rui Zheng

Categories: Computer Vision

2022-12-28

Three-dimensional (3D) ultrasound imaging has been applied to scoliosis assessment, but the current assessment method uses only the coronal projection image and cannot illustrate the 3D deformity and vertebra rotation. Vertebra detection is essential for revealing 3D spine information, but the detection task is challenging because of complex data and limited annotations. We propose VertMatch, a two-step framework that detects vertebral structures in 3D ultrasound volumes by utilizing unlabeled data in a semi-supervised manner. The first step detects the possible positions of structures on each transverse slice globally, and local patches are then cropped based on the detected positions. The second step decides whether the patches contain real vertebral structures and screens the predicted positions from the first step. VertMatch develops three novel components for semi-supervised learning. For position detection in the first step, (1) an anatomical prior is used to screen pseudo labels generated by the confidence threshold method, and (2) multi-slice consistency is used to exploit more unlabeled data by inputting multiple adjacent slices. For patch identification in the second step, (3) the categories are rebalanced in each batch to solve the imbalance problem. Experimental results demonstrate that VertMatch can detect vertebrae accurately in ultrasound volumes and outperforms state-of-the-art methods. VertMatch is also validated in a clinical application on forty ultrasound scans, and it can be a promising approach for 3D assessment of scoliosis.
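
The per-batch category rebalancing in the second step could look roughly like the sketch below. The abstract does not say how the rebalancing is done, so equal per-class sampling with replacement is an assumption, and `balanced_batch` and the toy label list are hypothetical.

```python
import random
from collections import defaultdict


def balanced_batch(labels, batch_size, rng):
    """Draw sample indices so that each class contributes equally to the batch."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    per_class = batch_size // len(by_class)
    batch = []
    for indices in by_class.values():
        # Sample with replacement so a rare class can still fill its quota.
        batch.extend(rng.choices(indices, k=per_class))
    rng.shuffle(batch)
    return batch


# Toy imbalanced data: 90 background patches (0) and 10 vertebra patches (1).
labels = [0] * 90 + [1] * 10
batch = balanced_batch(labels, batch_size=8, rng=random.Random(0))
```

With this scheme the classifier sees both categories equally often in every batch, regardless of how skewed the cropped patches are.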

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Jay Zhangjie Wu , Yixiao Ge , Xintao Wang , Weixian Lei , Yuchao Gu , Wynne Hsu , Ying Shan , Xiaohu Qie , Mike Zheng Shou

Categories: Computer Vision

2022-12-22

To reproduce the success of text-to-image (T2I) generation, recent works in text-to-video (T2V) generation employ large-scale text-video datasets for fine-tuning. However, such a paradigm is computationally expensive. Humans have the amazing ability to learn new visual concepts from just a single exemplar. We hereby study a new T2V generation problem, One-Shot Video Generation, where only a single text-video pair is presented for training an open-domain T2V generator. Intuitively, we propose to adapt a T2I diffusion model pretrained on massive image data for T2V generation. We make two key observations: 1) T2I models are able to generate images that align well with verb terms; 2) extending T2I models to generate multiple images concurrently exhibits surprisingly good content consistency. To further learn continuous motion, we propose Tune-A-Video with a tailored Sparse-Causal Attention, which generates videos from text prompts via an efficient one-shot tuning of pretrained T2I diffusion models. Tune-A-Video is capable of producing temporally coherent videos across various applications such as change of subject or background, attribute editing, and style transfer, demonstrating the versatility and effectiveness of our method.

Pay Attention to Your Tone: Introducing a New Dataset for Polite Language Rewrite

Xun Wang , Tao Ge , Allen Mao , Yuki Li , Furu Wei , Si-Qing Chen

Categories: Natural Language Processing

2022-12-20

We introduce PoliteRewrite, a dataset for polite language rewrite, which is a novel sentence rewrite task. Compared with previous text style transfer tasks that can mostly be addressed by slight token- or phrase-level edits, polite language rewrite requires deep understanding and extensive sentence-level edits of an offensive and impolite sentence to deliver the same message euphemistically and politely. This is more challenging, not only for NLP models but also for human annotators, who must rewrite with effort. To reduce the human effort needed for efficient annotation, we first propose a novel annotation paradigm in which human annotators and GPT-3.5 collaborate to annotate PoliteRewrite. The released dataset has 10K polite sentence rewrites annotated collaboratively by GPT-3.5 and humans, which can be used as a gold standard for training, validation, and test, and 100K high-quality polite sentence rewrites by GPT-3.5 without human review. We hope this work (the dataset, 10K+100K, will be released soon) can contribute to research on more challenging sentence rewrite and provoke more thought on resource annotation paradigms that draw on large-scale pretrained models.

DocAsRef: A Pilot Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely

Forrest Sheng Bao , Ruixuan Tu , Ge Luo

Categories: Artificial Intelligence | Natural Language Processing

2022-12-20

Summary quality assessment metrics fall into two categories: reference-based and reference-free. Reference-based metrics are theoretically more accurate but are limited by the availability and quality of human-written references, both of which are difficult to ensure. This has inspired the development of reference-free metrics, which are independent of human-written references, in the past few years. However, existing reference-free metrics cannot be both zero-shot and accurate. In this paper, we propose a zero-shot but accurate reference-free approach in a sneaky way: feeding the documents from which summaries are generated as references into reference-based metrics. Experimental results show that this zero-shot approach gives the best-performing reference-free metrics on nearly all aspects of several recently released datasets, sometimes even beating reference-free metrics specifically trained for this task. We further investigate which reference-based metrics can benefit from such repurposing and whether our additional tweaks help.
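
The repurposing idea can be sketched in a few lines. The metric below is a toy unigram-overlap F1 written for illustration; it stands in for whatever reference-based metric is being repurposed and is not one of the metrics evaluated in the paper. The function names and example strings are hypothetical.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Toy reference-based metric: F1 over overlapping unigrams."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def score_reference_free(summary: str, document: str) -> float:
    """Repurpose the metric: pass the source document as the reference."""
    return unigram_f1(summary, document)


document = "the cat sat on the mat while the dog slept"
good = score_reference_free("the cat sat on the mat", document)
bad = score_reference_free("stocks fell sharply today", document)
```

No human-written reference and no training are involved here: the only change is which text is passed in the reference slot.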

ColoristaNet for Photorealistic Video Style Transfer

Xiaowen Qiu , Ruize Xu , Boan He , Yingtao Zhang , Wenqiang Zhang , Weifeng Ge

Categories: Computer Vision | Machine Learning

2022-12-19

Photorealistic style transfer aims to transfer the artistic style of an image onto an input image or video while keeping photorealism. In this paper, we argue that the summary-statistics matching scheme in existing algorithms is what leads to unrealistic stylization. To avoid employing the popular Gram loss, we propose a self-supervised style transfer framework, which contains a style removal part and a style restoration part. The style removal network removes the original image styles, and the style restoration network recovers image styles in a supervised manner. Meanwhile, to address the problems in current feature transformation methods, we propose decoupled instance normalization to decompose feature transformation into style whitening and restylization. It works well in ColoristaNet and can transfer image styles efficiently while keeping photorealism. To ensure temporal coherency, we also incorporate optical flow methods and ConvLSTM to embed contextual information. Experiments demonstrate that ColoristaNet achieves better stylization effects than state-of-the-art algorithms.

Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?

Runpei Dong , Zekun Qi , Linfeng Zhang , Junbo Zhang , Jianjian Sun , Zheng Ge , Li Yi , Kaisheng Ma

Categories: Computer Vision

2022-12-16

The success of deep learning heavily relies on large-scale data with comprehensive labels, which is more expensive and time-consuming to obtain in 3D than for 2D images or natural languages. This promotes the potential of utilizing models pretrained with data more than 3D as teachers for cross-modal knowledge transfer. In this paper, we revisit masked modeling in a unified fashion of knowledge distillation, and we show that foundational Transformers pretrained with 2D images or natural languages can help self-supervised 3D representation learning through training Autoencoders as Cross-Modal Teachers (ACT). The pretrained Transformers are transferred as cross-modal 3D teachers using discrete variational autoencoding self-supervision, during which the Transformers are frozen with prompt tuning for better knowledge inheritance. The latent features encoded by the 3D teachers are used as the target of masked point modeling, wherein the dark knowledge is distilled to the 3D Transformer students as foundational geometry understanding. Our ACT-pretrained 3D learner achieves state-of-the-art generalization capacity across various downstream benchmarks, e.g., 88.21% overall accuracy on ScanObjectNN. Code will be released at https://github.com/RunpeiDong/ACT.
